Multi-mode Transmission for the MIMO Broadcast Channel with Imperfect Channel State Information

Jun Zhang, Marios Kountouris, Jeffrey G. Andrews, and Robert W. Heath Jr. 



Abstract 

This paper proposes an adaptive multi-mode transmission strategy to improve the spectral efficiency
achieved in the multiple-input multiple-output (MIMO) broadcast channel with delayed and quantized
channel state information. The adaptive strategy adjusts the number of active users, denoted as the
transmission mode, to balance transmit array gain, spatial division multiplexing gain, and residual
inter-user interference. Accurate closed-form approximations are derived for the achievable rates for
different modes, which help identify the active mode that maximizes the average sum throughput for
given feedback delay and channel quantization error. The proposed transmission strategy is combined
with round-robin scheduling, and is shown to provide throughput gain over single-user MIMO at moderate
signal-to-noise ratio. It only requires feedback of instantaneous channel state information from a small
number of users. With a feedback load constraint, the proposed algorithm provides performance close
to that achieved by opportunistic scheduling with instantaneous feedback from a large number of users.



Index Terms 

MIMO systems, space division multiplexing, broadcast channels, feedback, delay effects. 



I. Introduction 

Multi-user MIMO (MU-MIMO), which allows the base station (BS) to communicate with
multiple mobile users simultaneously, provides spatial division multiplexing gain^1 proportional
to the number of antennas ($N_t$) at the BS even with single-antenna mobiles [1], i.e. the sum
throughput grows linearly with $N_t$ at high SNR. To exploit the full spatial division multiplexing
gain in the MIMO broadcast channel (MIMO-BC), channel state information at the transmitter
(CSIT) is required to separate the spatial channels for different users. CSIT, however, is difficult
to obtain and is never perfect. Imperfect CSIT causes residual inter-user interference, which
degrades the performance of the MIMO-BC [2]. The capacity region of the MIMO-BC with
imperfect CSIT has not been fully characterized. Although not capacity achieving, linear precoding
is a practical transmission technique for the MIMO-BC; it is able to provide full spatial division
multiplexing gain given CSIT [3], [4].

^1 In this paper, we use the spatial division multiplexing gain to denote the spatial multiplexing gain provided by MU-MIMO
to differentiate it from that provided by the point-to-point MIMO.

Limited feedback, in which users feed back quantized channel information through a feedback 
channel, is an efficient way to provide partial CSIT for the MIMO-BC [5]. It has been shown 
that the full spatial division multiplexing gain in the MIMO-BC can be obtained with carefully 
designed feedback strategies and a feedback rate that grows linearly with SNR (in dB) and the 
number of transmit antennas [6]-[9]. Therefore, linear precoding combined with limited feedback 
provides a feasible way to exploit the spatial division multiplexing gain in the MIMO-BC. 

A. Related Work 

In existing commercial wireless communication systems, the number of feedback bits for each 
user is fixed and cannot be adjusted with SNR. In addition, there are other CSIT imperfections, 
such as estimation error and feedback delay, all of which make the system throughput-limited 
at high SNR due to the residual inter-user interference [6], [9]. One approach to improve the 
CSIT accuracy in limited feedback systems is to employ multiuser diversity, as proposed in [7]. 
To reduce feedback overhead in such systems, threshold based feedback can be adopted [10]. 
The CSIT accuracy of limited feedback can also be improved with a progressive refinement of 
precoding vectors by taking advantage of the temporal channel coherence [11]. An alternative 
approach to deal with imperfect CSIT is to switch adaptively between the single-user (SU) and
multi-user (MU) modes^2, as the SU mode does not suffer from residual interference at high
SNR. SU/MU mode switching algorithms for the random orthogonal beamforming system were 
proposed in [14], [15], where each user feeds back its preferred mode and the channel quality 
information (CQI). For the MIMO downlink with the number of receive antennas greater than 
or equal to the number of transmit antennas, an adaptive SU/MU mode switching algorithm was 
proposed in [16], which also considered correlation at transmit antennas. 



^2 Note that the term "mode" used in this paper denotes the number of active users rather than different MIMO transmission
techniques, such as the spatial multiplexing/diversity mode [12], or a different number of data streams for a given user, as in [13].

Most prior mode switching algorithms were based on instantaneous CSIT; no explicit mode
switching point has been obtained that is related to the key system parameters. An alternative 
approach is to switch based on the channel distribution information. In [17], the number of 
active users for a limited-feedback zero-forcing (ZF) precoding system was optimized through 
asymptotic analysis to maximize the spectral efficiency, based on average SNR and channel 
quantization codebook size. In [18], a SU/MU mode switching algorithm was proposed for 
the ZF precoding system considering delayed and quantized CSIT. The mode switching point 
can be explicitly derived based on the parameters including average SNR, normalized Doppler 
frequency, and codebook size, which are computable at the BS. 

Both switching algorithms in [17] and [18] require only the selected users to feed back their 
instantaneous channel information, which is desirable for systems with a large number of users. 
The technique in [18] is based on non-asymptotic analysis, and thus can better characterize the 
system behavior with different system parameters, e.g. the operating regions for both SU and 
MU modes can be determined for different delay/mobility or different codebook sizes. However, 
it only switches between the SU mode and the MU mode serving as many users as transmit 
antennas. It neglects the impact of the number of active users on transmit array gain, spatial 
division multiplexing gain, and residual inter-user interference. 

B. Contributions 

In this paper, we propose a general multi-mode switching algorithm that adaptively selects 
the number of active users in a MIMO-BC considering delayed and quantized CSIT. We derive 
accurate closed-form approximations for the achievable rates for different modes, based on which 
the mode that provides the highest throughput can be selected. Such multi-mode transmission 
(MMT) strategy improves the spectral efficiency by balancing transmit array gain, spatial division 
multiplexing gain, and residual interference. 

The proposed MMT strategy can be combined with channel-independent scheduling algo- 
rithms, such as round-robin scheduling, to serve a large number of users. In this way, the 
scheduling is of low complexity, and it greatly reduces the amount of feedback as only the pre- 
selected users need to feed back their instantaneous channel information for the precoder design. 
As instantaneous CSIT is not exploited, the proposed algorithm cannot provide multiuser diversity 
gain, but it is still able to provide a throughput gain over SU-MIMO transmission and can serve 
multiple users simultaneously. In addition, in certain scenarios such as with a feedback overhead 
constraint it is able to provide performance close to that achieved by opportunistic scheduling 
based on instantaneous CSI feedback from a large number of users. 

Based on analytical and numerical results, we provide the following insights and guidelines
for the design of MIMO transmission in the broadcast channel with imperfect CSIT: 

1) The SU mode should be applied at both low and high SNRs; at medium SNR, the MU 
mode can be used with the number of active users adaptively adjusted. The full MU 
mode that serves Nt users normally should not be activated, as it has the highest residual 
interference and no array gain. 

2) For practical systems such as 3GPP LTE, the MU mode should only be used with low 
mobility and the number of feedback bits should be increased or advanced feedback 
techniques such as progressive refinement [11] should be employed. A short frame length 
is desirable for MU-MIMO, as it is related to the amount of CSIT delay. 

3) Considering delay-mobility and channel quantization, our analysis can determine which 
source dominates the CSIT imperfection. For given delay-mobility, we should provide a 
sufficiently large codebook size, as otherwise channel quantization error will dominate the 
performance; on the other hand, there is no need for a too large codebook size, as other 
CSIT imperfections such as delay-mobility start to dominate the performance. 

4) For a fixed transmission mode, MMSE precoding outperforms ZF precoding. With im- 
perfect CSIT, both precoding schemes require multi-mode transmission to improve the 
achievable rate, and they provide close performance. As ZF precoding with MMT requires 
less feedback and its preferred mode can be easily calculated, it is preferred to MMSE 
precoding. 

5) If there is a constraint on the total number of feedback bits, high-quality feedback from
fewer than $N_t$ users together with mode selection provides good performance compared to
instantaneous feedback from a large number of users.

Compared to our previous study in [18], this paper has made important extensions. First, 
we have explicitly considered the impact of the number of active users on the performance of 
MU-MIMO, while [18] only considered the full MU mode with Nt users. Second, the proposed 
MMT strategy can be easily extended to serve a large number of users and provide a significant 
throughput gain, while the dual-mode switching in [18] is inflexible and is confined in a system 
with Nt users. Third, the analysis and simulation results in this paper provide useful insights for 
the practical system design, while [18] emphasized analysis. 

Organization: The rest of the paper is organized as follows: the system model and some
assumptions are presented in Section II. In Section III, closed-form approximations are derived
for the average achievable rates for different modes. User scheduling based on MMT is proposed
in Section IV. Numerical results and conclusions are in Sections V and VI, respectively.

Notation: In this paper, we use uppercase boldface letters for matrices ($\mathbf{X}$) and lowercase
boldface for vectors ($\mathbf{x}$). $\mathbb{E}[\cdot]$ is the expectation operator. The conjugate transpose of a matrix
$\mathbf{X}$ (vector $\mathbf{x}$) is $\mathbf{X}^*$ ($\mathbf{x}^*$). Similarly, $\mathbf{X}^\dagger$ denotes the pseudo-inverse, $\tilde{\mathbf{x}}$ denotes the normalized
vector of $\mathbf{x}$, i.e. $\tilde{\mathbf{x}} = \mathbf{x}/\|\mathbf{x}\|$, and $\hat{\mathbf{x}}$ denotes the quantized vector of $\tilde{\mathbf{x}}$.

II. System Model 

We consider a MIMO-BC with $N_t$ antennas at the transmitter and $U$ single-antenna mobiles.
The discrete-time complex baseband received signal at the u-th user at time n is given as

\[ y_u[n] = \mathbf{h}_u^*[n] \sum_{v=1}^{M} \mathbf{f}_v[n] x_v[n] + z_u[n], \tag{1} \]

where M is the active mode, i.e. the number of active users, $M = 1, 2, \ldots, N_t$, $\mathbf{h}_u[n] \in \mathbb{C}^{N_t \times 1}$ is the
channel vector for the u-th user, and $z_u[n] \sim \mathcal{CN}(0, 1)$ is normalized complex additive
Gaussian noise. $x_u[n]$ and $\mathbf{f}_u[n] \in \mathbb{C}^{N_t \times 1}$ are the transmit signal and precoding vector
for the u-th user. The transmit power constraint is $\mathbb{E}\left[\sum_{u=1}^{M} |x_u[n]|^2\right] = P$,
and we assume equal power allocation among different users. As the noise is normalized, P is
also the average SNR. Eigen-beamforming is applied for the SU mode (M = 1), which transmits
along the channel direction and is optimal for SU-MIMO with perfect CSIT [19]. ZF precoding
is used for the MU mode ($1 < M \le N_t$), as it is possible to derive closed-form results due to its
simple structure, and it is optimal among the set of all linear precoders at asymptotically high
SNR [4].

To assist the analysis, we assume that the channel $\mathbf{h}_u[n]$ is well modeled as a spatially white
Gaussian channel, with entries $h_i[n] \sim \mathcal{CN}(0, 1)$. The investigation of other channel models is
left to future work. We assume perfect channel state information (CSI) at the receiver, while the 
transmitter obtains CSI through limited feedback from the receiver. In addition, there is delay in 
the available CSIT. The models of the CSI delay and limited feedback are presented as follows. 

A. CSI Delay Model 

We consider a stationary ergodic Gauss-Markov block fading regular process (or autoregressive
model of order 1) [20, Sec. 16-1], where the channel remains constant for a symbol duration
and changes from symbol to symbol according to

\[ \mathbf{h}[n] = \rho\,\mathbf{h}[n-1] + \mathbf{e}[n], \tag{2} \]

where $\mathbf{e}[n]$ is the channel error vector, with independent and identically-distributed (i.i.d.) entries
$e_i[n] \sim \mathcal{CN}(0, \epsilon_e^2)$, uncorrelated with $\mathbf{h}[n-1]$, and i.i.d. in time. We assume the CSI delay is
one symbol period, but the results can be easily extended to the case with delays of multiple
symbols. For the numerical analysis, we use a Gauss-Markov "fit" of the classical Clarke's
isotropic scattering model [21], as the true Clarke's model is complicated for analysis. Then
the correlation coefficient is $\rho = J_0(2\pi f_d T_s)$ with Doppler spread $f_d$, where $T_s$ is the symbol
duration and $J_0(\cdot)$ is the zero-th order Bessel function of the first kind. The variance of the error
vector is $\epsilon_e^2 = 1 - \rho^2$. The value $f_d T_s$ is the normalized Doppler frequency. As shown in [8],
the behaviors are different for the Clarke's and Gauss-Markov model, especially at high SNR. 
However, the impact of channel quantization still persists and dominates at high SNR, so the 
conclusion in the paper still holds. In addition, the analysis for the Gauss-Markov model can be 
extended for the estimation or prediction error, which makes our results more general. 
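
As a rough illustration of the delay model, the following sketch computes the normalized Doppler frequency, the correlation coefficient, and the error variance. It is not part of the original analysis. It assumes the usual maximum-Doppler relation f_d = v f_c / c, which the text does not state explicitly, and it borrows the carrier frequency of 2.1 GHz from Section V. The function names `bessel_j0` and `gauss_markov_params` are hypothetical, and the Bessel function is evaluated with its power series so that only the standard library is needed.

```python
import math

def bessel_j0(x: float, terms: int = 40) -> float:
    """Zero-th order Bessel function of the first kind via its power series.

    J0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2. The series is adequate for the
    small arguments that arise for typical normalized Doppler values.
    """
    q = (x * x) / 4.0
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -q / (k * k)          # ratio of consecutive series terms
        total += term
    return total

def gauss_markov_params(speed_kmh: float, carrier_hz: float, delay_s: float):
    """Return (fd*Ts, rho, eps_e^2) for the first-order Gauss-Markov fit.

    Assumption (not stated in the text): fd = v * fc / c.
    """
    c = 299_792_458.0                                  # speed of light in m/s
    doppler_hz = (speed_kmh / 3.6) * carrier_hz / c    # maximum Doppler shift
    fd_ts = doppler_hz * delay_s                       # normalized Doppler
    rho = bessel_j0(2.0 * math.pi * fd_ts)             # rho = J0(2*pi*fd*Ts)
    return fd_ts, rho, 1.0 - rho * rho                 # eps_e^2 = 1 - rho^2

fd_ts, rho, eps2 = gauss_markov_params(speed_kmh=10.0, carrier_hz=2.1e9, delay_s=1e-3)
print(f"fd*Ts = {fd_ts:.4f}, rho = {rho:.5f}, eps_e^2 = {eps2:.5f}")
```

Under these assumptions, a pedestrian-speed user with a 1 msec delay keeps the correlation coefficient very close to one, so the error variance stays well below the unit channel variance.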

B. Channel Quantization 

The channel direction information is fed back using a quantization codebook known at both
the transmitter and receiver. The quantization is chosen from a codebook of unit norm vectors of
size $L = 2^B$. Each user is assumed to have a different codebook to avoid users sharing the same
quantization vector^3. The codebook for user u is $\mathcal{C}_u = \{\mathbf{c}_{u,1}, \mathbf{c}_{u,2}, \ldots, \mathbf{c}_{u,L}\}$. Each user quantizes
its channel direction to the closest codeword, measured by the inner product. Therefore, the index
of channel for user u is

\[ I_u = \arg\max_{1 \le \ell \le L} \left|\tilde{\mathbf{h}}_u^* \mathbf{c}_{u,\ell}\right|, \tag{3} \]

where $\tilde{\mathbf{h}}_u = \mathbf{h}_u / \|\mathbf{h}_u\|$ is the channel direction. Random vector quantization (RVQ) [6], [22] is
used to facilitate the analysis, where each quantization vector is independently chosen from the
isotropic distribution on the $N_t$-dimensional unit sphere. We analyze performance averaged over
realizations of all such random codebooks in addition to averaging over the fading distribution.
It was shown in [23] that the RVQ-based codebook is asymptotically optimal in probability as
$N_t, B \to \infty$, with $B/N_t$ kept constant.

Let $\cos\theta_u = |\tilde{\mathbf{h}}_u^* \hat{\mathbf{h}}_u|$, where $\theta_u = \angle(\tilde{\mathbf{h}}_u, \hat{\mathbf{h}}_u)$, then we have [24]

\[ \xi_u = \mathbb{E}_{\theta_u}\left[\cos^2\theta_u\right] = 1 - 2^B \cdot \beta\left(2^B, \frac{N_t}{N_t - 1}\right), \tag{4} \]

where $\beta(x, y)$ is the Beta function, i.e. $\beta(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}$ with $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$ as the
Gamma function.
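
The quantity in (4) is easy to evaluate numerically. The sketch below is illustrative only: the function name `expected_cos2` is hypothetical, and the Beta function is evaluated in the log domain with `math.lgamma` so that large codebooks do not overflow.

```python
import math

def expected_cos2(num_bits: int, num_tx: int) -> float:
    """xi = E[cos^2 theta] = 1 - 2^B * beta(2^B, Nt/(Nt-1)) under RVQ, as in (4)."""
    codebook_size = 2.0 ** num_bits
    a = num_tx / (num_tx - 1.0)
    # log(2^B * Gamma(2^B) * Gamma(a) / Gamma(2^B + a)), computed without overflow
    log_term = (num_bits * math.log(2.0) + math.lgamma(codebook_size)
                + math.lgamma(a) - math.lgamma(codebook_size + a))
    return 1.0 - math.exp(log_term)

for bits in (6, 10, 15, 18):
    print(bits, round(expected_cos2(bits, num_tx=4), 5))
```

The expected quantization error 1 - xi shrinks as B grows, which is the trend the mode selection analysis relies on.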

^3 A low-complexity method to generate different codebooks for different users is to first generate a common codebook, and
then generate the codebook for each user through a random unitary rotation of this common codebook.

III. Throughput Analysis and Mode Selection 

In this section, we derive average achievable rates for different transmission modes. It is shown 
that the number of active users to maximize the sum throughput is closely related to transmit array 
gain, spatial division multiplexing gain, and residual inter-user interference. MMT is proposed to 
adaptively select the active mode to balance between these effects and maximize the throughput. 
For a selected mode M*, M* users will be selected based on channel-independent scheduling 
algorithms such as round-robin scheduling. The scheduling algorithm will be further discussed 
in Section IV.



A. Perfect CSIT 

We first consider the system with perfect CSIT, which serves as the basis for the analysis of 
the impact of imperfect CSIT. 

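1) SU-MIMO (Eigen-beamforming), M = 1: With perfect CSIT, the beamforming (BF) vector
is the channel direction, i.e. $\mathbf{f}[n] = \tilde{\mathbf{h}}[n]$. The average throughput is the same as that of a maximal
ratio combining diversity system, given in [25] as

\[ R_{CSIT}(1) = R_{BF}(\gamma, N_t) \triangleq \mathbb{E}_{\mathbf{h}}\left[\log_2\left(1 + \gamma\,|\mathbf{h}^*[n]\mathbf{f}[n]|^2\right)\right] = \log_2(e)\, e^{1/\gamma} \sum_{k=0}^{N_t-1} \frac{\Gamma(-k, 1/\gamma)}{\gamma^k}, \tag{5} \]

where $\Gamma(\alpha, x) = \int_x^\infty t^{\alpha-1} e^{-t}\,dt$ is the complementary incomplete gamma function, and $R_{BF}(\gamma, n)$
is the rate function for the diversity system with SNR $\gamma$ and diversity order n; here $\gamma = P$, the
average SNR. The BF system provides transmit array gain $N_t$ as $\mathbb{E}_{\mathbf{h}}\left[\gamma\,|\mathbf{h}^*[n]\mathbf{f}[n]|^2\right] = N_t\gamma$. The array gain is defined as the
increase in the average combined SNR over the average SNR on each branch [26].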
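
The closed form in (5) can be evaluated with the standard library alone. The following is a minimal sketch, not the authors' implementation: it assumes the series expansion of the exponential integral for Gamma(0, x) and the recurrence Gamma(-m, x) = (x^(-m) e^(-x) - Gamma(1-m, x)) / m for the negative orders. The function names are hypothetical, and the series is only intended for moderate arguments, roughly 1/SNR up to about 10.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x: float, terms: int = 200) -> float:
    """E1(x) = Gamma(0, x) from its convergent series; intended for 0 < x <= ~10."""
    total, term = 0.0, 1.0
    for k in range(1, terms):
        term *= -x / k            # term = (-x)^k / k!
        total -= term / k         # accumulates -(-x)^k / (k * k!)
    return -EULER_GAMMA - math.log(x) + total

def upper_gamma_neg(k: int, x: float) -> float:
    """Gamma(-k, x) for integer k >= 0, by recursion starting from E1(x)."""
    value = exp_integral_e1(x)
    for m in range(1, k + 1):
        value = (x ** (-m) * math.exp(-x) - value) / m
    return value

def rate_bf(snr: float, order: int) -> float:
    """R_BF(gamma, n) from (5), in bps/Hz (snr is linear, not in dB)."""
    x = 1.0 / snr
    s = sum(upper_gamma_neg(k, x) / snr ** k for k in range(order))
    return math.log2(math.e) * math.exp(x) * s

for snr_db in (0, 10, 20):
    snr = 10 ** (snr_db / 10)
    print(snr_db, round(rate_bf(snr, 1), 3), round(rate_bf(snr, 4), 3))
```

Comparing the two printed columns shows the rate increase that comes from the array gain when the diversity order grows from 1 to 4.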

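2) MU-MIMO (Zero-forcing), $1 < M \le N_t$: The received SINR for the u-th user in a linear
precoding MU-MIMO system in mode M is given by

\[ \mathrm{SINR}_u(M) = \frac{\frac{P}{M}\,|\mathbf{h}_u^*[n]\mathbf{f}_u[n]|^2}{1 + \frac{P}{M}\sum_{v \ne u} |\mathbf{h}_u^*[n]\mathbf{f}_v[n]|^2}. \tag{6} \]

Denote $\mathbf{H}[n] = [\mathbf{h}_1[n], \mathbf{h}_2[n], \ldots, \mathbf{h}_M[n]]^*$, and the pseudo-inverse of $\mathbf{H}[n]$ as $\mathbf{F}[n] = \mathbf{H}^\dagger[n] =
\mathbf{H}^*[n](\mathbf{H}[n]\mathbf{H}^*[n])^{-1}$. The ZF precoding vector for the u-th user is obtained by normalizing the
u-th column of $\mathbf{F}[n]$. Therefore, $\mathbf{h}_u^*[n]\mathbf{f}_v[n] = 0$, $\forall u \ne v$, i.e. there is no inter-user interference
and each user gets an equivalent interference-free channel. The SINR for the u-th user becomes

\[ \mathrm{SINR}_{ZF,u}(M) = \frac{P}{M}\,|\mathbf{h}_u^*[n]\mathbf{f}_u[n]|^2. \tag{7} \]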

Due to the isotropic nature of i.i.d. Rayleigh fading, such orthogonality constraints to precancel
inter-user interference consume M − 1 degrees of freedom at the transmitter. As a result, the
effective channel gain of each parallel channel is a chi-square random variable with $2(N_t - M + 1)$
degrees of freedom [4], [27], i.e. $|\mathbf{h}_u^*[n]\mathbf{f}_u[n]|^2 \sim \chi^2_{2(N_t - M + 1)}$. Therefore, the channel for each
user is equivalent to a diversity channel with order $(N_t - M + 1)$ and effective SNR $P/M$. The
average achievable rate for the u-th user in mode M is

\[ R_{CSIT,u}(M) = R_{BF}\left(\frac{P}{M}, N_t - M + 1\right). \tag{8} \]

The average achievable sum rate for the ZF system of mode M is

\[ R_{CSIT}(M) = \sum_{u=1}^{M} R_{CSIT,u}(M) \overset{(a)}{=} M\, R_{BF}\left(\frac{P}{M}, N_t - M + 1\right). \tag{9} \]

The equality (a) follows from the homogeneous nature of the network. When M = 1, this reduces to (5).

3) Mode Selection: From (9), the system in mode M provides a spatial division multiplexing
gain of M and an array gain of $(N_t - M + 1)$ for each user. As M increases, the achievable
spatial division multiplexing gain increases but the array gain decreases. Therefore, there is a
tradeoff between the achievable array gain and the spatial division multiplexing gain. From (9),
the mode that achieves the highest throughput for the given average SNR can be determined as

\[ M^* = \arg\max_{1 \le M \le N_t} R_{CSIT}(M). \tag{10} \]

Note that this is a very simple optimization problem, as only Nt values need to be computed 
and compared. This transmission strategy that adaptively adjusts the number of active users is 
denoted as multi-mode transmission (MMT). 
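
The selection rule in (10) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the rate function is evaluated by simple midpoint quadrature over the Gamma-distributed channel gain instead of the closed form in (5), and the names `rate_bf_quad`, `sum_rate_csit`, and `best_mode_csit` are hypothetical.

```python
import math

def rate_bf_quad(snr: float, order: int, upper: float = 60.0, steps: int = 20000) -> float:
    """E[log2(1 + snr * X)] with X ~ Gamma(order, 1), by midpoint quadrature."""
    h = upper / steps
    norm = math.factorial(order - 1)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.log2(1.0 + snr * x) * x ** (order - 1) * math.exp(-x) / norm
    return total * h

def sum_rate_csit(mode: int, snr: float, num_tx: int) -> float:
    """Sum rate (9): M * R_BF(P/M, Nt - M + 1)."""
    return mode * rate_bf_quad(snr / mode, num_tx - mode + 1)

def best_mode_csit(snr: float, num_tx: int) -> int:
    """Mode selection (10): evaluate all Nt modes and keep the best one."""
    return max(range(1, num_tx + 1), key=lambda m: sum_rate_csit(m, snr, num_tx))

for snr_db in (-20, 0, 10, 20, 40):
    snr = 10 ** (snr_db / 10)
    print(snr_db, best_mode_csit(snr, num_tx=4))
```

With perfect CSIT the preferred mode moves from the SU mode at low SNR, where array gain matters most, toward the full MU mode at high SNR, where multiplexing gain dominates.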

B. Imperfect CSIT 

In this section, we consider imperfect CSIT, including both delay and channel quantization. 
As it is difficult to derive the exact achievable rate for such a system, we provide accurate 
closed-form approximations for mode selection. The average achievable rate for the SU mode 
M = 1 in such a system is provided in [18]. 

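For the MU mode with delay and quantization, the precoding vectors are designed based on the
quantized channel directions with delay, which achieve $\hat{\mathbf{h}}_u^*[n-1]\,\mathbf{f}_v^{(QD)}[n] = 0$, $\forall u \ne v$. Letters
'Q' and 'D' denote parameters of the system with channel quantization (limited feedback) and
delay, respectively. The SINR for the u-th user in mode M is given as

\[ \gamma_u^{(QD)}(M) = \frac{\frac{P}{M}\,|\mathbf{h}_u^*[n]\mathbf{f}_u^{(QD)}[n]|^2}{1 + \frac{P}{M}\sum_{v \ne u} |\mathbf{h}_u^*[n]\mathbf{f}_v^{(QD)}[n]|^2}. \tag{11} \]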

We assume the mobile users can perfectly estimate the noise and interference and feed back
this information to the transmitter, so the achievable rate for the u-th user, with SINR $\gamma_u^{(QD)}(M)$, is given as

\[ R_{QD,u}(M) = \mathbb{E}\left[\log_2\left(1 + \gamma_u^{(QD)}(M)\right)\right]. \tag{12} \]

The same metric is used in [6], [9], [18].

We first analyze the signal term and the interference term in (11).
1) Signal term: First, we approximate the signal term as



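\[
\begin{aligned}
\frac{P}{M}\,|\mathbf{h}_u^*[n]\mathbf{f}_u^{(QD)}[n]|^2
&= \frac{P}{M}\,\left|(\rho_u \mathbf{h}_u[n-1] + \mathbf{e}_u[n])^* \mathbf{f}_u^{(QD)}[n]\right|^2 \\
&\overset{(a)}{\approx} \frac{P}{M}\,\rho_u^2\,\left|\mathbf{h}_u^*[n-1]\mathbf{f}_u^{(QD)}[n]\right|^2 \\
&= \frac{P}{M}\,\rho_u^2\,\|\mathbf{h}_u[n-1]\|^2 \cdot \left|\tilde{\mathbf{h}}_u^*[n-1]\mathbf{f}_u^{(QD)}[n]\right|^2 \\
&\overset{(b)}{=} \frac{P}{M}\,\rho_u^2\,\|\mathbf{h}_u[n-1]\|^2 \cdot \left|(\cos\theta_u \hat{\mathbf{h}}_u[n-1] + \sin\theta_u \mathbf{g}_u[n-1])^* \mathbf{f}_u^{(QD)}[n]\right|^2 \\
&\overset{(c)}{\approx} \frac{P}{M}\,\rho_u^2\,\xi_u\,\|\mathbf{h}_u[n-1]\|^2 \cdot \left|\hat{\mathbf{h}}_u^*[n-1]\mathbf{f}_u^{(QD)}[n]\right|^2,
\end{aligned} \tag{13}
\]

where step (a) removes $\mathbf{e}_u^*[n]\mathbf{f}_u^{(QD)}[n]$, which is normally very small compared with the remaining
term. In step (b), we write $\tilde{\mathbf{h}}_u[n-1] = (\cos\theta_u)\hat{\mathbf{h}}_u[n-1] + (\sin\theta_u)\mathbf{g}_u[n-1]$, where $\theta_u =
\angle(\tilde{\mathbf{h}}_u[n-1], \hat{\mathbf{h}}_u[n-1])$ and $\mathbf{g}_u[n-1]$ is orthogonal to $\hat{\mathbf{h}}_u[n-1]$. Step (c) approximates the actual
channel direction by the quantized version, which is justified for small quantization error, and
approximates $\cos^2\theta_u$ by its expectation $\xi_u$ given in (4). As the quantized channel directions
$\hat{\mathbf{h}}_u[n-1]$ are independent of each other, and $\mathbf{f}_u^{(QD)}[n]$ is designed to lie in the nullspace of $\hat{\mathbf{h}}_v[n-1]$,
$\forall v \ne u$, similar to the case of perfect CSIT, $\|\mathbf{h}_u[n-1]\|^2\,|\hat{\mathbf{h}}_u^*[n-1]\mathbf{f}_u^{(QD)}[n]|^2 \sim \chi^2_{2(N_t - M + 1)}$.
So, as in the perfect CSIT case, it also provides an array gain of $N_t - M + 1$ for each user.
2) Interference Term: The residual interference term can be approximated as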



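\[
\begin{aligned}
\frac{P}{M}\sum_{v \ne u} \left|\mathbf{h}_u^*[n]\mathbf{f}_v^{(QD)}[n]\right|^2
&= \frac{P}{M}\sum_{v \ne u} \left|(\rho_u \mathbf{h}_u[n-1] + \mathbf{e}_u[n])^* \mathbf{f}_v^{(QD)}[n]\right|^2 \\
&\approx \frac{P}{M}\sum_{v \ne u} \left(\rho_u^2 \left|\mathbf{h}_u^*[n-1]\mathbf{f}_v^{(QD)}[n]\right|^2 + \left|\mathbf{e}_u^*[n]\mathbf{f}_v^{(QD)}[n]\right|^2\right).
\end{aligned} \tag{14}
\]

The approximation for the denominator comes from removing the cross terms, which contain both
$\mathbf{e}_u[n]$ and $\mathbf{h}_u[n-1]$. As shown in [18], the interference term due to quantization,
$|\mathbf{h}_u^*[n-1]\mathbf{f}_v^{(QD)}[n]|^2$, can be well approximated as an exponential random variable with mean
$\delta = 2^{-\frac{B}{N_t - 1}}$. The interference term due to delay, $|\mathbf{e}_u^*[n]\mathbf{f}_v^{(QD)}[n]|^2$, is also an exponential
random variable, with mean $\epsilon_{e,u}^2$, as $\mathbf{e}_u[n] \sim \mathcal{CN}(\mathbf{0}, \epsilon_{e,u}^2 \mathbf{I})$ is independent of the unit-norm
vector $\mathbf{f}_v^{(QD)}[n]$.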

Based on the analysis and following the results in [6], [8], we can derive an upper bound for
the rate loss for the u-th user due to imperfect CSIT, stated in the following theorem.

Theorem 1 (Rate loss): The rate loss for the u-th user with imperfect CSIT compared to that
with perfect CSIT is upper bounded by

\[ R_{CSIT,u} - R_{QD,u} \le \log_2 \Delta_u^{(QD)}, \tag{15} \]

where $\Delta_u^{(QD)}$ is the average noise plus residual interference, given by

\[ \Delta_u^{(QD)} = \mathbb{E}\left[1 + \frac{P}{M}\sum_{v \ne u} \left|\mathbf{h}_u^*[n]\mathbf{f}_v^{(QD)}[n]\right|^2\right]
= 1 + (M - 1)\frac{P}{M}\left(\rho_u^2\, 2^{-\frac{B}{N_t - 1}} + \epsilon_{e,u}^2\right). \tag{16} \]



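Proof: The average noise plus interference $\Delta_u^{(QD)}$ can be derived based on the distribution
of the interference term and following the approach in [18], which gives

\[
\mathbb{E}\left|\mathbf{h}_u^*[n]\mathbf{f}_v^{(QD)}[n]\right|^2
= \rho_u^2\, \mathbb{E}\left|\mathbf{h}_u^*[n-1]\mathbf{f}_v^{(QD)}[n]\right|^2 + \mathbb{E}\left|\mathbf{e}_u^*[n]\mathbf{f}_v^{(QD)}[n]\right|^2
= \rho_u^2\, 2^{-\frac{B}{N_t - 1}} + \epsilon_{e,u}^2.
\]

The upper bound is obtained by following the approach in [6], [8]. ■
Remark 1: From (16), we see that residual interference/rate loss depends on delay, codebook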
size, Nt, and M. It increases with delay, and decreases with codebook size. It also increases 
with P, which makes the system interference-limited at high SNR. With other parameters fixed, 
the residual interference increases as M increases, which means it may not be desirable to serve 
too many users. 
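
The trends listed in Remark 1 can be checked numerically from (15) and (16). The sketch below is illustrative only; the function names are hypothetical and the parameter values in the loop are arbitrary examples, not values taken from the paper's figures.

```python
import math

def noise_plus_interference(mode, snr, num_tx, num_bits, rho):
    """Delta from (16): 1 + (M-1) * (P/M) * (rho^2 * 2^(-B/(Nt-1)) + eps_e^2)."""
    eps2 = 1.0 - rho * rho                          # delay error variance
    quant = 2.0 ** (-num_bits / (num_tx - 1))       # quantization interference mean
    return 1.0 + (mode - 1) * (snr / mode) * (rho * rho * quant + eps2)

def rate_loss_bound(mode, snr, num_tx, num_bits, rho):
    """Upper bound (15) on the per-user rate loss, in bps/Hz."""
    return math.log2(noise_plus_interference(mode, snr, num_tx, num_bits, rho))

for mode in range(1, 5):
    print(mode, round(rate_loss_bound(mode, snr=100.0, num_tx=4, num_bits=10, rho=0.99), 3))
```

For fixed SNR, delay, and codebook size, the bound grows with the number of active users, since the factor (M - 1)/M increases with M.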

The bound analysis in (15) provides helpful insights on the effects of different system parameters
on the rate loss, but it is not accurate enough for mode selection. To accurately characterize
the achievable rate, we derive the closed-form approximation for the MU mode in the following 
theorem. 

Theorem 2 (Average achievable throughput): The average achievable rate for the u-th user
in the MU system of mode M (M > 1) with both delay and channel quantization can be
approximated by

\[ R_{QD,u}(M) \approx \sum_{i=0}^{N_t - L - 1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \sum_{l=0}^{i} [\ldots], \tag{17} \]

where $\alpha = [\ldots]$, $\delta_1 = [\ldots]$, $\delta_2 = [\ldots]$, $L = M - 1$, the coefficients $a^{(1)}$ and $a^{(2)}$ are given in [...],
and $I_1(\cdot)$ is the integral given in (30) in Appendix A.

Proof: See Appendix A. ■

Remark 2: To calculate (17) for a given user, we need only information about its correlation
coefficient ($\rho_u^2 = 1 - \epsilon_{e,u}^2$), codebook size (B), and average SNR (P). Such information is
normally fixed or changes slowly. Each user can feed back and update its own information,
and then the BS can calculate the achievable rate. Alternatively, each user may calculate the
achievable rate and feed back the preferred mode index. Note that the calculation and the mode
selection are only done when the parameters change, such as a path loss change due to mobility.

As special cases, the average achievable rate for the system with delay or limited feedback is 
provided as follows. 

Corollary 1: The average achievable rate for the u-th user in the delayed/limited-feedback
system of mode M (M > 1) can be approximated by

\[ R_u(M) \approx \log_2(e) \sum_{i=0}^{N_t - L - 1} [\ldots], \tag{18} \]

where $L = M - 1$, $\alpha = [\ldots]$, $\beta = [\ldots]$, with

\[ [\ldots] = \begin{cases} [\ldots] & \text{with delay only} \\ 1 - \delta & \text{with limited feedback only,} \end{cases} \tag{19} \]

and $I_1(\cdot)$ is given in (30) in Appendix A.

Proof: See Appendix B. ■
In the following, we provide high SNR approximations for MU modes that can be used to 
analyze the performance in the interference-limited region. 

Theorem 3 (High SNR approximation): The average achievable rate for the u-th user in the
system with both delay and channel quantization in the MU mode M > 1 at high SNR is
approximated as

\[ R_{QD,u}(M) \approx \sum_{i=0}^{N_t - L - 1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} [\ldots], \tag{20} \]

where $\alpha = \rho_u^2 \xi_u$, the coefficients $a^{(1)}$ and $a^{(2)}$ are given in (26) and (27) with $\delta_1 = \rho_u^2 \delta$, $\delta_2 = \epsilon_{e,u}^2$,
and $I_2(\cdot, \cdot, \cdot)$ is the integral

\[ I_2(a, m, n) = \int [\ldots]\, dx, \tag{21} \]

for which a closed-form expression can be found in [28, Sec. 3.8].

Proof: See Appendix C. ■
The high SNR result for the system with delay or limited feedback is provided in the following 
corollary. 

Corollary 2: The achievable rate for the u-th user in the delayed/limited feedback system
in MU mode M > 1 at high SNR is approximated as

\[ R_u(M) \approx \log_2(e) \sum_{i=0}^{[\ldots]} [\ldots]\, I_2([\ldots]), \tag{22} \]

where $\alpha = [\ldots]$ for the delayed system, and $\alpha = [\ldots]$ for the limited feedback system, $L = M - 1$, and
$I_2(\cdot, \cdot, \cdot)$ is given in (21).

Proof: Following the steps in Appendix C with $\delta_1 = 0$. ■
Based on (17) and the approximation for the beamforming system in [18], the active mode that
achieves the highest average throughput in the system with both delay and channel quantization
is selected according to

\[ M^* = \arg\max_{1 \le M \le N_t} R_{QD}(M), \tag{23} \]

where $R_{QD}(M) = \sum_{u=1}^{M} R_{QD,u}(M)$, with $R_{QD,u}(M)$ given in (17) for M > 1 and in [18] for M = 1.

Remark 3: Considering (16), (13), and (17), the mode M is now related to residual interference,
transmit array gain, and spatial division multiplexing gain. The idea of MMT is to
balance between these effects to maximize the system throughput. The mode selection is based 
on fixed system parameters - the number of transmit antennas and the codebook size - and slow 
time-varying channel information - average SNR and normalized Doppler frequency. 
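
To make the selection rule (23) concrete, the sketch below estimates the sum rate of each mode by Monte Carlo sampling instead of the closed-form expression (17). It is a stand-in under explicit assumptions, not the authors' method: the signal term follows the approximation (13) with a Gamma(Nt - M + 1, 1) channel gain, the interference follows (14) with independent exponential terms of mean delta and eps_e^2, the signal and interference are treated as independent, and the SU mode (M = 1) reuses the same signal approximation with no interference, whereas the paper takes the SU rate from [18]. All function names are hypothetical.

```python
import math
import random

def expected_cos2(num_bits: int, num_tx: int) -> float:
    """xi from (4): 1 - 2^B * beta(2^B, Nt/(Nt-1)), evaluated in the log domain."""
    size = 2.0 ** num_bits
    a = num_tx / (num_tx - 1.0)
    log_term = (num_bits * math.log(2.0) + math.lgamma(size)
                + math.lgamma(a) - math.lgamma(size + a))
    return 1.0 - math.exp(log_term)

def mc_sum_rate(mode, snr, num_tx, num_bits, rho, samples=20000, seed=0):
    """Monte Carlo estimate of the mode-M sum rate under the assumptions above."""
    rng = random.Random(seed)
    rho2 = rho * rho
    eps2 = 1.0 - rho2
    delta = 2.0 ** (-num_bits / (num_tx - 1))
    xi = expected_cos2(num_bits, num_tx)
    p = snr / mode                                   # equal power split, P/M
    total = 0.0
    for _ in range(samples):
        # Signal term (13): (P/M) * rho^2 * xi * Gamma(Nt - M + 1, 1) gain.
        signal = p * rho2 * xi * rng.gammavariate(num_tx - mode + 1, 1.0)
        interference = 0.0
        if mode > 1:
            # Interference (14): sums of M-1 exponential terms are Gamma(M-1, 1).
            interference = p * (rho2 * delta * rng.gammavariate(mode - 1, 1.0)
                                + eps2 * rng.gammavariate(mode - 1, 1.0))
        total += math.log2(1.0 + signal / (1.0 + interference))
    return mode * total / samples                    # homogeneous users

def best_mode(snr, num_tx, num_bits, rho):
    """Pick the mode with the largest estimated sum rate, as in (23)."""
    return max(range(1, num_tx + 1),
               key=lambda m: mc_sum_rate(m, snr, num_tx, num_bits, rho))

for snr_db in (0, 10, 20, 30, 40):
    print(snr_db, best_mode(10 ** (snr_db / 10), num_tx=4, num_bits=18, rho=0.996))
```

Because the same seed is used for every mode, the comparison across modes is repeatable. The estimate is only as good as the distributional assumptions listed above.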

To show the accuracy of the derived approximations, numerical results are provided in Fig. 1
for different modes, with $N_t = 4$, B = 18 bits, v = 10 km/hr, and $T_s$ = 1 msec. We see that the
approximation is very accurate at low to medium SNRs. At high SNR, when the sum rate of the
MU mode saturates, the approximation becomes a lower bound, and the accuracy decreases as
M increases. Interestingly, we see that the mode M = 3 always provides a higher throughput
than the full MU mode M = 4. This is due to the fact that the full mode has the highest level of
residual interference, as shown in (16), while it provides no array gain. Therefore, it is desirable
to serve fewer than $N_t$ users. We see that MMT is able to provide a throughput gain around 2
bps/Hz over the dual-mode switching [18] at medium SNR.

It is easy to verify that the high SNR result (20) matches the approximation, and can predict
the behavior in the interference-limited region. In Fig. 2, we plot $M^* = \arg\max_{M > 1} R_{QD}(M)$ at high SNR
for different normalized Doppler frequency $f_d T_s$, i.e. the mode with the highest sum rate in the
interference-limited region. We see that $M^*$ is different for different $f_d T_s$. For B = 10 bits, the
mode M = 2 always has the highest throughput in the considered $f_d T_s$ range, as it provides a
higher array gain and has lower residual interference than the higher modes; for B = 20 bits,
as the CSIT accuracy is improved compared to B = 10 bits, M = 3 has the highest rate at high
SNR when $f_d T_s$ is small, but M = 4 still has a lower throughput; the highest mode M = 4
provides the highest throughput only for B = 30 bits and very small $f_d T_s$ ($< 10^{[\ldots]}$). So with
both delay and channel quantization, the highest mode $M = N_t$ is normally not preferable.

IV. Multi-mode Transmission with Round-robin Scheduling 

For the MIMO-BC with linear precoding, the number of users that can be supported simulta- 
neously is constrained by the number of transmit antennas, so we need to select and transmit to 
a subset of users in each time slot since typically $U \gg N_t$. Based on the throughput analysis in
Section III, MMT can be easily combined with round-robin scheduling, where the scheduling
is based on the selected transmission mode but not on the instantaneous CSI feedback. This 
scheduling algorithm is of low complexity, only requires instantaneous CSI feedback from a 
small number of users, and provides temporal fairness. It is suitable for delay sensitive services 
or can be applied when it is possible to get feedback only from few users. 

We assume that the BS has knowledge of the average received SNR and the normalized 
Doppler frequency of each user, which can be fed back from users and do not require frequent 
update as they change slowly. Based on such information, the BS can estimate the average 
achievable throughput for each user for a given subset of selected users based on (17). All the
users are indexed and ordered as user 1,2, ... ,U. Scheduling starts from the head of the user 
queue. Once a user is scheduled, it is moved to the tail of the user queue. In the given time 
slot, denote $u_h$ as the index of the user at the head of the user queue, and $\mathcal{S}$ as the subset of
the selected users. The scheduling algorithm is given in Table I and also illustrated in Fig. 3. It
is similar to the conventional round-robin scheduling and serves users in circular order, but in
each time slot multiple users rather than a single user may be served. MMT is applied in each
time slot, where the number of selected users is determined based on the analysis in Section III.
Such user scheduling is denoted as Multi-Mode Transmission (MMT) based scheduling.

Remark 4: Note that for the homogeneous network where all the users have the same average 
SNR, normalized Doppler frequency, and number of feedback bits, the number of scheduled 
users in each time slot is fixed, denoted as M*. The value of M* is determined from (23), and
then M* users are selected from the head of the user queue. This is the main scenario considered 
in the paper, but the scheduling algorithm can be applied to heterogeneous networks as well. 
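
For the homogeneous case of Remark 4, the queue handling can be sketched as follows. This is a minimal illustration of the description above, not a transcription of the scheduling table: the function name `mmt_round_robin` is hypothetical, the selected mode is passed in as a fixed number, and it is assumed that the mode does not exceed the number of users.

```python
from collections import deque

def mmt_round_robin(num_users: int, mode: int, num_slots: int):
    """Serve `mode` users per slot in circular order; return the per-slot user sets."""
    queue = deque(range(1, num_users + 1))      # users indexed 1, 2, ..., U
    schedule = []
    for _ in range(num_slots):
        served = [queue.popleft() for _ in range(mode)]   # take users from the head
        queue.extend(served)                              # move them to the tail
        schedule.append(served)
    return schedule

for slot, users in enumerate(mmt_round_robin(num_users=5, mode=2, num_slots=3), start=1):
    print(slot, users)
```

Only the users returned for a given slot would need to feed back their instantaneous CSI in that slot, which is the source of the feedback reduction discussed in this section.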

Compared to opportunistic scheduling based on instantaneous CSI feedback, the proposed 
algorithm greatly reduces the amount of CSI feedback, as in each time slot only the selected M
users ($M \ll U$) need to feed back their instantaneous CSI for the precoder design. In addition, as
will be shown in numerical results, it provides performance close to the one with instantaneous
CSI feedback from a large number of users. The proposed algorithm is suitable for other systems
with scheduling independent of the channel status, such as random selection or the ones based
on the queue length.

V. Numerical Results 

This section presents numerical results to demonstrate the performance of our proposed 
transmission strategy and to provide design guidelines in practical systems. We focus on a 
homogeneous network where all the users have i.i.d. channels. The number of transmit antennas 
at the BS is Nt = 4, which is the value currently implemented in broadband wireless standards 
such as 3GPP LTE. 



A. Operating Region and Throughput Gain of Multi-mode Transmission 

In this section, we consider parameters used in the 3GPP LTE (Long Term Evolution) standard 
[29], [30]. The Advanced Wireless Services (AWS) spectrum, which is one of the prime candidates
for initial LTE deployment in the US, is considered, i.e. the carrier frequency is f_c = 2.1
GHz. In LTE, the minimum size of radio resource that can be allocated in the time domain is
one subframe of 1 msec. Considering the propagation and processing time, the typical CSIT
delay in the FDD mode is five subframes, i.e. the delay is τ = 5 msec.

1) Operating Regions: Based on the results in Section III, the preferred mode M* can be
determined for a given scenario. Accordingly, the operating regions for different modes can be
plotted for different system parameters. Fig. 4(a) and Fig. 4(b) show the operating regions for the
system with both delay and channel quantization, for different mobility v and different feedback
bits B, respectively, where each mark on the figure denotes the type of the active mode. There
are several key observations:

1) For the given v and B, the SU mode (M = 1) will be active at both low and high SNRs, 
due to its array gain and the robustness to imperfect CSIT, respectively. 

2) For MU modes to be active, v needs to be small while B needs to be large. Specifically,
to activate M = 2, we need v < 12 km/hr with B = 15 bits as in Fig. 4(a), and need
B > 6 bits with v = 5 km/hr as in Fig. 4(b); to activate M = 3, we need v < 6 km/hr
with B = 15 bits and B > 14 bits with v = 5 km/hr. Note that in LTE each user only
feeds back 4 bits to indicate its channel direction.

3) The full MU mode M = Nt is not active at all with the considered parameters, as it suffers 
from the highest residual interference and does not provide array gain. 

2) SU-MIMO vs. MMT: In Fig. 5, we compare MMT with ZF precoding (MMT-ZF) and
the single-user beamforming (SU-BF) transmission, both with round-robin scheduling, and with
f_c = 2.1 GHz, τ = 5 msec, and v = 5 km/hr. We see that for B = 4 the curves of MMT-ZF
and SU-BF overlap, which means no MU mode is activated. This confirms the result in Fig.
4(b). For B = 8, MMT-ZF provides throughput gain over SU-BF for SNR in ~ 18 dB. For
B = 12, MMT-ZF provides throughput gain for SNR in -5 ~ 25 dB, and the gain is larger than 20%
for SNR = 10 ~ 20 dB.

3) Delay vs. Quantization Error: From the analysis of residual interference terms in Section
III-B, we can determine for a given scenario which effect dominates, delay-mobility or channel
quantization error. Each interference term due to delay-mobility has variance $\sigma_D^2 = \epsilon_e^2$, while
each interference term due to channel quantization error has variance $\sigma_Q^2 = \rho^2 2^{-B/(N_t-1)}$. For
example, for v = 5 km/hr, f_c = 2.1 GHz, and τ = 5 msec, we have $\sigma_D^2 = 0.0458$, and
$\sigma_Q^2 = 0.3787$ with B = 4, $\sigma_Q^2 = 0.1503$ with B = 8, so $\sigma_D^2 \ll \sigma_Q^2$ in these scenarios and the
channel quantization error dominates the performance, as shown in Fig. 5. To get $\sigma_D^2 \approx \sigma_Q^2$, we
need B = 13, but a too large B will not help much as delay-mobility will start to dominate.
Fig. 6 shows the performance of MMT-ZF with different values of B. We see that the sum rate
cannot be further improved once B is sufficiently large (B > 20).
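The variances quoted in this example can be reproduced with a short script. The sketch below rests on assumptions that are not stated in this section: the Jakes correlation model ρ = J0(2π f_d τ), the relation ε_e² = 1 − ρ², and a propagation speed of 3 × 10^8 m/s. The function names are hypothetical, and the Bessel function is evaluated with a truncated power series to avoid external dependencies.

```python
import math

def bessel_j0(x, terms=20):
    """Zeroth-order Bessel function of the first kind via its power series."""
    return sum((-1) ** k * (x / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def interference_variances(v_kmh, fc_hz, delay_s, bits, nt=4, c=3.0e8):
    """Return (sigma_D^2, sigma_Q^2) under the assumed Jakes model."""
    doppler_hz = (v_kmh / 3.6) * fc_hz / c           # maximum Doppler shift f_d
    rho = bessel_j0(2 * math.pi * doppler_hz * delay_s)
    sigma_d2 = 1.0 - rho ** 2                         # delay-mobility term
    sigma_q2 = rho ** 2 * 2 ** (-bits / (nt - 1))     # quantization term
    return sigma_d2, sigma_q2

for bits in (4, 8, 13):
    sd2, sq2 = interference_variances(5, 2.1e9, 5e-3, bits)
    print(f"B = {bits:2d}: sigma_D^2 = {sd2:.4f}, sigma_Q^2 = {sq2:.4f}")
```

Under these assumptions, B = 13 is the codebook size at which the two variances are closest, which agrees with the value given in the text.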

Remark 5: The numerical results in this section provide the following insights: 

1) As shown in Fig. 4(a), for a given delay and given B, the mobility plays a significant role,
and MU-MIMO should only be used with low mobility (< 12 km/hr).

2) Both Fig. 4(b) and Fig. 5 show that the number of feedback bits in LTE (B = 4) is not
large enough for MU-MIMO and should be increased (B > 8).

3) As CSI feedback occurs only in certain subframes, the delay in available CSIT is related 
to the radio frame length. Therefore, it is expected that MU-MIMO is more applicable
in the LTE system, which has a shorter frame length (1 msec) than the WiMAX system (5
msec).

4) Reducing quantization error by increasing B is not enough to fully exploit the throughput 
gain of MU-MIMO, and other CSIT imperfections should be taken into consideration. 

B. ZF vs. MMSE Precoding 

MMSE precoding, or regularized ZF precoding, can increase the throughput at low SNR compared
to ZF precoding [31]. Fig. 7 compares the sum rates of MMT-ZF and MMSE precoding.
For the perfect CSIT case, the number of active users for MMSE precoding is fixed to be U = Nt,
as little gain can be achieved by varying the user number. We see that the sum rates of the two
systems are very close. This means that MMT improves the performance of ZF precoding and
approaches that of MMSE precoding. With imperfect CSIT, simulation results for different modes
with MMSE precoding are plotted, showing that mode switching is also required to improve the
spectral efficiency for MMSE precoding. If MMT is applied for both systems, we see that the
performance of ZF precoding approaches that of MMSE precoding. Note that MMSE precoding
requires instantaneous CSI feedback from all the users, while the number of users that need to
feed back instantaneous CSI for ZF precoding depends on the active mode and normally is less
than Nt. In addition, as MMSE precoding is difficult to analyze, the preferred mode cannot be
easily determined. Therefore, when MMT is employed, ZF precoding is preferred to MMSE precoding.

C. MMT Under a Feedback Overhead Constraint

In this section, we consider the scenario with a constraint on the total feedback overhead, i.e. 
the total number of feedback bits from all the users is fixed to be Bt [32]. We compare the 
proposed MMT-ZF with round-robin scheduling and the ZF precoding with opportunistic user 
selection (US-ZF) that is based on instantaneous CSI feedback. As shown in [32], with a feedback 
overhead constraint it is more desirable to get high-rate/high-quality feedback from a small 
number of users than low-rate/coarse feedback from a large number of users, and the optimal
number of feedback bits from each user and the number of active users can be determined. We 
focus on the impact of limited feedback, and delay is not considered in this section. For MMT-ZF,
the number of feedback bits for each user is ⌊Bt/M⌋ for mode M. For US-ZF, the optimal
feedback bits for each user, B_FB, is obtained through simulation, and ⌊Bt/B_FB⌋ users feed back
their instantaneous CSI for scheduling.
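The bit allocation under a total budget can be illustrated as follows. This is a minimal sketch assuming that the budget Bt is split evenly among the users that feed back and rounded down, as the roundoff mentioned in the caption of Fig. 8 suggests. The function names and the budget of 100 bits are hypothetical.

```python
def mmt_bits_per_user(total_bits, mode):
    """Bits each of the `mode` active users may feed back (even split, rounded down)."""
    return total_bits // mode

def us_feedback_users(total_bits, bits_per_user):
    """Number of users that can feed back when each one uses `bits_per_user` bits."""
    return total_bits // bits_per_user

# Hypothetical budget of 100 bits and modes M = 1..4.
allocation = [mmt_bits_per_user(100, m) for m in range(1, 5)]
print(allocation)
print(us_feedback_users(100, 12))
```

The integer division is the reason why the per-user resolution, and hence the MMT-ZF sum rate, does not vary smoothly with Bt.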

Fig. 8 shows the performance of MMT-ZF and US-ZF, together with PU2RC (Per Unitary
basis stream User and Rate Control) [33] and single-user beamforming (SU-BF) with
channel-dependent maximum rate scheduling. We see that MMT-ZF almost always performs better than
both PU2RC and SU-BF, and its performance is close to US-ZF. At low SNR, PU2RC provides
slightly higher throughput than MMT-ZF as it provides multiuser diversity, which is a kind of
power gain and dominates the performance at low SNR. For a given SNR value, when Bt keeps
increasing, the throughput of US-ZF increases due to multiuser diversity, while the sum rate of
MMT-ZF saturates as the full mode M = Nt is activated and there is limited performance gain
from further improving the accuracy of CSIT.

In [32], an approximation was derived to solve for B_FB (eq. (11)). However, it is based on the rate loss derivation, and
is not accurate, especially for medium to high SNR values. The inaccuracy of the rate-loss-based result at high SNR was also
shown in [18].
VI. Conclusions

In this paper, we propose a multi-mode transmission strategy that adaptively adjusts the number 
of active users based on the average achievable rate. Considering transmit array gain, spatial 
division multiplexing gain and residual inter-user interference, multi-mode transmission improves 
the spectral efficiency over SU-MIMO at medium SNR. It is shown that the full mode M* = Nt 
will normally not be activated as it has the highest residual interference and no array gain. 
Multi-mode transmission can be combined with round-robin scheduling to serve a large number 
of users, which selects users based on average SNR, codebook size, and normalized Doppler 
frequency. The proposed algorithm significantly reduces the feedback amount, and provides 
throughput close to the one based on instantaneous CSI feedback when there is a total feedback 
overhead constraint. The analysis and numerical results provide insights and design guidelines 
that are of practical importance. 

Appendix 

A. Proof of Theorem 2

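Assuming interference terms are independent, and independent of the signal term, $\gamma_{\mathrm{ZF},u}^{(QD)}$ can
be approximated as

$$\gamma_{\mathrm{ZF},u}^{(QD)} \approx \frac{\alpha z}{1 + \delta_1 y_1 + \delta_2 y_2}, \qquad (24)$$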

where a = ^,5i = 82 = yi ~ xIl^ 2/2 ~ xIl^ z ~ xl{Nt-Ly L = M - 1, and yi, 
2/2, z are independent of each other. 

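Let $y = \delta_1 y_1 + \delta_2 y_2$. Then the pdf of $y$, which is the sum of two independent chi-square random
variables, is given as [34]

$$p_Y(y) = e^{-y/\delta_1} \sum_{i=0}^{L-1} a_i^{(1)} y^i + e^{-y/\delta_2} \sum_{i=0}^{L-1} a_i^{(2)} y^i = \sum_{j=1}^{2} \sum_{k=0}^{L-1} a_k^{(j)} y^k e^{-y/\delta_j}, \qquad (25)$$

where

$$a_i^{(1)} = \frac{1}{\delta_1^{i+1} (L-1)!} \left( \frac{\delta_1}{\delta_1 - \delta_2} \right)^{L} \frac{(2(L-1)-i)!}{i!\,(L-1-i)!} \left( \frac{\delta_2}{\delta_2 - \delta_1} \right)^{L-1-i}, \qquad (26)$$

$$a_i^{(2)} = \frac{1}{\delta_2^{i+1} (L-1)!} \left( \frac{\delta_2}{\delta_2 - \delta_1} \right)^{L} \frac{(2(L-1)-i)!}{i!\,(L-1-i)!} \left( \frac{\delta_1}{\delta_1 - \delta_2} \right)^{L-1-i}. \qquad (27)$$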
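The coefficients of the pdf can be checked numerically. The sketch below evaluates $a_i^{(1)}$ and $a_i^{(2)}$ and verifies that the resulting pdf integrates to one and has mean $L(\delta_1 + \delta_2)$. It assumes that $\chi^2_{2L}$ denotes a sum of L unit-mean exponential variables and that $\delta_1 \neq \delta_2$. The function names and parameter values are hypothetical.

```python
import math

def pdf_coeffs(delta1, delta2, L):
    """Return (a1, a2) with a1[i] = a_i^(1) and a2[i] = a_i^(2) for i = 0..L-1."""
    def coeffs(d_own, d_other):
        out = []
        for i in range(L):
            value = (1.0 / (d_own ** (i + 1) * math.factorial(L - 1))
                     * (d_own / (d_own - d_other)) ** L
                     * math.factorial(2 * (L - 1) - i)
                     / (math.factorial(i) * math.factorial(L - 1 - i))
                     * (d_other / (d_other - d_own)) ** (L - 1 - i))
            out.append(value)
        return out
    return coeffs(delta1, delta2), coeffs(delta2, delta1)

def moment(delta1, delta2, L, order):
    """E[y^order] from the closed-form pdf, using int_0^inf y^n e^(-y/d) dy = n! d^(n+1)."""
    a1, a2 = pdf_coeffs(delta1, delta2, L)
    total = 0.0
    for a, d in ((a1, delta1), (a2, delta2)):
        for i, a_i in enumerate(a):
            n = i + order
            total += a_i * math.factorial(n) * d ** (n + 1)
    return total

# Hypothetical values: delta1 = 0.3, delta2 = 0.05, L = 3 (mode M = 4).
print(moment(0.3, 0.05, 3, 0), moment(0.3, 0.05, 3, 1))
```

The zeroth moment should equal one and the first moment should equal $L(\delta_1 + \delta_2)$, which gives a quick sanity check before the coefficients are used in the rate expressions.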
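We first derive the cumulative distribution function (cdf) of the random variable x as follows.

$$
\begin{aligned}
F_X(x) &= P\left[ \frac{\alpha z}{1+y} \le x \right] = \int_0^\infty F_{Z|Y}\left( \frac{x}{\alpha}(1+y) \right) p_Y(y)\,dy \\
&= \int_0^\infty \left( 1 - e^{-\frac{x}{\alpha}(1+y)} \sum_{i=0}^{N_t-L-1} \frac{1}{i!} \left( \frac{x}{\alpha}(1+y) \right)^i \right) p_Y(y)\,dy \\
&= 1 - e^{-x/\alpha} \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \frac{a_k^{(j)}}{i!} \left( \frac{x}{\alpha} \right)^i \int_0^\infty (1+y)^i y^k \exp\left( -\left( \frac{x}{\alpha} + \frac{1}{\delta_j} \right) y \right) dy \\
&\overset{(a)}{=} 1 - e^{-x/\alpha} \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \sum_{l=0}^{i} \frac{a_k^{(j)}}{i!} \binom{i}{l} \left( \frac{x}{\alpha} \right)^i \int_0^\infty y^{l+k} \exp\left( -\left( \frac{x}{\alpha} + \frac{1}{\delta_j} \right) y \right) dy \\
&\overset{(b)}{=} 1 - e^{-x/\alpha} \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \sum_{l=0}^{i} \frac{a_k^{(j)}}{i!} \binom{i}{l} (l+k)!\, \frac{(x/\alpha)^i}{\left( \frac{x}{\alpha} + \frac{1}{\delta_j} \right)^{l+k+1}}, \qquad (28)
\end{aligned}
$$

where step (a) follows binomial expansion of $(y+1)^i$, and step (b) follows the equality $\int_0^\infty y^n e^{-cy}\,dy = n!\,c^{-(n+1)}$.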
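The closed-form cdf can be evaluated directly. The following sketch computes $F_X(x)$ from the quadruple sum and checks it against the special case Nt = 2, M = 2, for which z is exponential and $F_X(x) = 1 - e^{-x/\alpha} / ((1 + \delta_1 x/\alpha)(1 + \delta_2 x/\alpha))$. It assumes $\delta_1 \neq \delta_2$ and the same unit-mean convention for the chi-square variables. The function names and numerical values are hypothetical.

```python
import math

def coeffs(d_own, d_other, L):
    """Coefficients a_k^(j) of the pdf of y for the exponential with scale d_own."""
    return [(1.0 / (d_own ** (k + 1) * math.factorial(L - 1))
             * (d_own / (d_own - d_other)) ** L
             * math.factorial(2 * (L - 1) - k)
             / (math.factorial(k) * math.factorial(L - 1 - k))
             * (d_other / (d_other - d_own)) ** (L - 1 - k))
            for k in range(L)]

def cdf_x(x, alpha, delta1, delta2, nt, m):
    """Closed-form cdf of x = alpha*z / (1 + y) for mode m >= 2 with nt antennas."""
    L = m - 1
    t = x / alpha
    total = 0.0
    for d_own, d_other in ((delta1, delta2), (delta2, delta1)):
        a = coeffs(d_own, d_other, L)
        for i in range(nt - L):              # i = 0 .. Nt-L-1
            for k in range(L):               # k = 0 .. L-1
                for l in range(i + 1):       # l = 0 .. i
                    total += (a[k] / math.factorial(i) * math.comb(i, l)
                              * math.factorial(l + k) * t ** i
                              / (t + 1.0 / d_own) ** (l + k + 1))
    return 1.0 - math.exp(-t) * total

# Hypothetical parameters: alpha = 2, delta1 = 0.5, delta2 = 0.2.
print(cdf_x(1.5, 2.0, 0.5, 0.2, nt=4, m=3))
```

Evaluating the function on a grid of x values also gives a simple way to integrate $(1 - F_X(x))/(1+x)$ numerically when the closed-form integral is not at hand.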

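The expectation of ln(1 + x) on x is derived as follows.

$$
\begin{aligned}
E_X[\ln(1+X)] &= \int_0^\infty \ln(1+x)\,dF_X \overset{(a)}{=} \int_0^\infty \frac{1 - F_X(x)}{1+x}\,dx \\
&= \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \sum_{l=0}^{i} \frac{a_k^{(j)}}{i!} \binom{i}{l} (l+k)!\, \alpha^{l+k-i+1} \int_0^\infty \frac{x^i e^{-x/\alpha}}{(1+x) \left( x + \frac{\alpha}{\delta_j} \right)^{l+k+1}}\,dx \\
&\overset{(b)}{=} \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} \sum_{l=0}^{i} \frac{a_k^{(j)}}{i!} \binom{i}{l} (l+k)!\, \alpha^{l+k-i+1}\, I_1\left( \frac{1}{\alpha}, \frac{\alpha}{\delta_j}, i, l+k+1 \right), \qquad (29)
\end{aligned}
$$

where step (a) follows integration by parts and $I_1(\cdot,\cdot,\cdot,\cdot)$ in step (b) is given by

$$I_1(a, b, m, n) = \int_0^\infty \frac{x^m e^{-ax}}{(x+b)^n (x+1)}\,dx, \qquad (30)$$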

for which a closed-form expression can be found in [28, Sec. 3.8]. Then the achievable rate
for the u-th user in mode M can be approximated as

$$R_{QD,u}(M) = E_\gamma\left[ \log_2\left( 1 + \gamma_{\mathrm{ZF},u}^{(QD)} \right) \right] \approx \log_2(e) \cdot E_X[\ln(1+X)],$$

which gives (17).



B. Proof of Corollary [7] 

For the system with delay, similar to (13), the received signal power for the u-th user is
approximated as

$$P_S^{(D)} \approx \frac{P}{M} \left| \rho_u \mathbf{h}_u^*[n-1]\, \mathbf{f}_u^{(D)}[n] \right|^2. \qquad (31)$$

Similar to (24), the received SINR can be approximated by

$$\gamma_{\mathrm{ZF},u}^{(D)} \approx \frac{\alpha z}{1 + \beta y}, \qquad (32)$$

where a = /3 = y ~ xIl^ ^ ~ xl(Nt-Ly L = M - 1, and y, z are independent of 

each other. 

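For the system with limited feedback, similar to (13), we get the received signal power as

$$
\begin{aligned}
P_S^{(Q)} &= \frac{P}{M} \|\mathbf{h}_u[n]\|^2 \cdot \left| (\cos\theta_u)\, \hat{\mathbf{h}}_u^*[n] \mathbf{f}_u^{(Q)}[n] + (\sin\theta_u)\, \mathbf{g}_u^*[n] \mathbf{f}_u^{(Q)}[n] \right|^2 \\
&\overset{(a)}{\approx} \frac{P}{M} \|\mathbf{h}_u[n]\|^2 \cdot \left| (\cos\theta_u)\, \hat{\mathbf{h}}_u^*[n] \mathbf{f}_u^{(Q)}[n] \right|^2 \\
&\overset{(b)}{\approx} \frac{P}{M} \|\mathbf{h}_u[n]\|^2 \cdot E_\theta\left[ \cos^2\theta_u \right] \cdot \left| \hat{\mathbf{h}}_u^*[n] \mathbf{f}_u^{(Q)}[n] \right|^2, \qquad (33)
\end{aligned}
$$

where in step (a) we remove the term with $\sin\theta_u$, which is small for a large B, and in step (b)
we approximate $\cos^2\theta_u$ by $E_\theta[\cos^2\theta_u]$. As shown in [6], $E_\theta[\cos^2\theta_u]$ is well approximated by $1 - \delta$,
with $\delta = 2^{-B/(N_t-1)}$. Then the received SINR for the u-th user is approximated by

$$\gamma_{\mathrm{ZF},u}^{(Q)} \approx \frac{\alpha z}{1 + \beta y}, \qquad (34)$$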

where a = -^^^f^, /? = y ~ xIl^ ^ ~ X2(Nt-L)^ L = M — 1, and y, z are independent of 
each other. 

From (32) and (34), we can get the result in (18) following the steps in Appendix A.



C. Proof of Theorem |21 

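From (24), the received SINR for the u-th user at high SNR is approximated by

$$\gamma_{\mathrm{ZF},u}^{(QD)} \approx \frac{\alpha z}{\delta_1 y_1 + \delta_2 y_2},$$

where $\alpha = \rho_u^2$, $\delta_1 = \rho_u^2 \delta$, $\delta_2 = \epsilon_{e,u}^2$, $z \sim \chi^2_{2(N_t-L)}$, $y_1 \sim \chi^2_{2L}$, $y_2 \sim \chi^2_{2L}$, $L = M - 1$, and $z$, $y_1$,
and $y_2$ are independent.

Denote $x = \frac{\alpha z}{\delta_1 y_1 + \delta_2 y_2}$. Following the same steps as in Appendix A, the cdf of $x$ is derived as

$$F_X(x) = 1 - \sum_{i=0}^{N_t-L-1} \sum_{j=1}^{2} \sum_{k=0}^{L-1} a_k^{(j)}\, \frac{(i+k)!}{i!}\, \frac{(x/\alpha)^i}{\left( \frac{x}{\alpha} + \frac{1}{\delta_j} \right)^{i+k+1}}. \qquad (35)$$

Then following the steps in (29) we get the result in (20).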
TABLE I
Scheduling Algorithm

1) Initially, set S = {u_h}, R_old = R_QD(u_h), u = u_h.

2) While |u - u_h| < Nt

a) S' = S + {u + 1 mod U}.

b) Calculate R_new = Σ_{s∈S'} R_QD,s(S').

c) If R_new > R_old, set R_old = R_new, S = S'.

d) Set u = u + 1 mod U.

3) Let u_h = u_h + |S| mod U.
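The steps of Table I can be sketched as follows. This is an illustrative reading of the algorithm, not the authors' code. The callback `sum_rate_fn` and the toy rate model `toy_sum_rate` are hypothetical stand-ins for the closed-form rate estimate. The loop is limited to Nt - 1 additional candidates, an assumption made here so that at most Nt users are served, and U ≥ Nt is assumed.

```python
import math

def mmt_round_robin_slot(head, num_users, num_antennas, sum_rate_fn):
    """One scheduling slot: greedily grow the served set starting from the queue head."""
    selected = [head]
    best_rate = sum_rate_fn(selected)
    u = head
    # Assumption: examine at most num_antennas - 1 further users.
    for _ in range(num_antennas - 1):
        candidate = selected + [(u + 1) % num_users]
        rate = sum_rate_fn(candidate)
        if rate > best_rate:                 # keep the larger set only if it helps
            best_rate, selected = rate, candidate
        u = (u + 1) % num_users
    new_head = (head + len(selected)) % num_users
    return selected, best_rate, new_head

def toy_sum_rate(subset, nt=4, snr=10.0, leak=4.0):
    """Toy rate model (array gain vs. residual interference); NOT the paper's formula."""
    m = len(subset)
    sinr = snr * (nt - m + 1) / (1.0 + leak * (m - 1))
    return m * math.log2(1.0 + sinr)

selected, rate, new_head = mmt_round_robin_slot(0, 10, 4, toy_sum_rate)
print(selected, round(rate, 3), new_head)
```

With this toy model, serving two users gives the largest sum rate, so the head of the queue advances by two positions for the next slot.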
Fig. 1. Simulation results and approximations for different M, Nt = 4, v = 10 km/hr, Ts = 1 msec, B = 18 bits.
Fig. 2. The MU mode with the highest rate ceiling for different fdTs, with Nt = 4.
Fig. 3. Illustration of multi-mode transmission (MMT) based round-robin scheduling.
Fig. 4. Operating regions for different modes with both CSI delay and channel quantization, Nt = 4. The mode M = i means
that there are i active users. In this plot, 'x' is for M = 1, 'o' is for M = 2, '+' is for M = 3, and is for M = 4. Note
that the highest mode M = 4 is never activated in either figure.
Fig. 5. Simulation results of SU-BF and MMT-ZF with different B, Nt = 4, f_c = 2.1 GHz, τ = 5 msec, and v = 5 km/hr.
Fig. 6. Performance of MMT-ZF with different B, Nt = 4, f_c = 2.1 GHz, τ = 5 msec, and v = 5 km/hr.
Fig. 7. Simulation results of MMSE precoding and MMT-ZF systems, Nt = 4. For imperfect CSIT, B = 18 bits, f_c = 2.1
GHz, τ = 5 msec, and v = 5 km/hr.
(a) Sum rates of different systems, T = 100
(b) Sum rates for different Bt, SNR=15 dB.
Fig. 8. Sum rates of different systems with a feedback overhead constraint. The curve of MMT-ZF is not smooth for different
Bt due to the roundoff of ⌊Bt/M⌋.



